Abstract: A single-source network is said to be memory-free if
all of the internal nodes (those other than the source and the sinks)
do not employ memory but merely send linear combinations of
the incoming symbols (received at their incoming edges) on their
outgoing edges. Memory-free networks with delay that use network
coding are forced to perform inter-generation network coding, as a
result of which some or all sinks may require a large amount of
memory for decoding. In this work, we address this problem by also
utilizing memory elements at the internal nodes of the network,
which reduces the number of memory elements used at the sinks.
We give an algorithm which employs memory at the nodes to achieve
single-generation network coding. For fixed latency, our algorithm
reduces the total number of memory elements used in the network to
achieve single-generation network coding. We also discuss the
advantages of employing single-generation network coding together
with convolutional network-error correction codes (CNECCs) for
networks with unit-delay and illustrate, using simulations on an
example network under a probabilistic network error model, the
performance gain of CNECCs obtained by using memory at the
intermediate nodes.

I. Introduction 

Network coding was introduced in [1] as a means of 
achieving maximum rate of transmission in wireline networks. 
An algebraic formulation of network coding was discussed in 
[2] for both instantaneous networks and networks with delays. 
Convolutional network-error correcting codes (CNECCs) were
introduced for acyclic instantaneous networks in [3] and for 
unit-delay, memory-free networks in [4]. 

In this work, we consider acyclic, single-source networks 
with delays which have a multicast network code in place. The 
set of all code symbols generated at the source at any particular 
time instant is called a generation. In unit-delay, memory-free
networks, the nodes of the network may receive information
of different generations on their incoming edges at every time
instant and therefore network coding across generations
(inter-generation) is unavoidable in general. As a consequence, the
sinks have to employ memory to decode the symbols. If memory
is also utilized at the internal nodes, such inter-generation
network coding can be avoided, thus making the decoding
simpler.

We define a single-generation network code as a network 
code where all the symbols received at all the sinks are linear 
combinations of the symbols belonging to the same generation. 
In [5], the technique of adding memory at the nodes to achieve 
single-generation network coding was discussed. However, this
was done only on a per-node basis without considering the
entire topology or the network code of the network. On the
other hand, we consider the entire network topology and
the network code, which govern the addition of memory
elements at the nodes and the way in which they are rearranged
across the network to reduce the overall memory usage in the
network.

The organization and contributions of this work are as
follows.

• After briefly discussing the network setup and the network
code for an acyclic network with delays and memory
(Section II), we introduce different methods of adding
memory at a node and analyze how each of them affects
the local and global encoding kernels of the network code
(Section III).

• We also present different memory reduction and distribu- 
tion techniques (Section IV). 

• We propose an algorithm which uses the memory at 
the nodes to achieve single-generation network coding 
while reducing the overall memory usage in the network 
(Section V). 

• We discuss the advantages of employing memory at the 
intermediate nodes in tandem with CNECCs in terms of 
their encoding/decoding (Section VI). 

• We illustrate the performance benefits of using
memory for CNECCs in unit-delay networks through
simulations on an example unit-delay network under a
probabilistic error setting (Section VII).

II. Networks with delay and memory 

The model for acyclic networks with delays considered in
this paper is as in [2]. An acyclic network can be represented as
an acyclic directed multi-graph (a graph that can have parallel
edges between nodes) $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of all
vertices and $\mathcal{E}$ is the set of all edges in the network.

We assume that every edge in the directed multi-graph
representing the network has unit capacity (can carry at most
one symbol from $\mathbb{F}_q$, the field with $q$ elements). Network
links with capacities greater than unit are modeled as parallel
edges. The network has delays, i.e., every edge in the directed
graph representing the network has a unit delay associated with it,
represented by the parameter $z$. Such networks are known as
unit-delay networks. Network links with delays greater
than unit are modeled as serially concatenated edges in the
directed multi-graph. We assume a single source node $s \in \mathcal{V}$
and a set of sinks $\mathcal{T}$. Let $n_T$ be the unicast capacity for a sink
node $T \in \mathcal{T}$, i.e., the maximum number of edge-disjoint paths
from $s$ to $T$. Then

$$n_{\min} = \min_{T \in \mathcal{T}} n_T$$

is the max-flow min-cut capacity of the multicast connection. 
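
The capacity above can be checked numerically on small examples. The following is a minimal sketch, assuming the network is given as a list of directed unit-capacity edges (parallel edges are simply repeated) and that the source differs from every sink; the function names and the butterfly edge list are illustrative and not taken from the paper. It counts edge-disjoint paths with breadth-first augmenting paths.

```python
from collections import deque

def edge_disjoint_paths(edges, s, t):
    """Number of edge-disjoint s-t paths in a directed multigraph (unit capacities)."""
    # Residual arcs: arc 2k is edge k, arc 2k+1 is its reverse.
    to, cap, adj = [], [], {}
    for u, v in edges:
        adj.setdefault(u, []).append(len(to)); to.append(v); cap.append(1)
        adj.setdefault(v, []).append(len(to)); to.append(u); cap.append(0)
    flow = 0
    while True:
        parent = {s: None}              # node -> residual arc used to reach it
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for a in adj.get(u, []):
                if cap[a] > 0 and to[a] not in parent:
                    parent[to[a]] = a
                    queue.append(to[a])
        if t not in parent:
            return flow
        node = t
        while parent[node] is not None:  # push one unit along the path found
            a = parent[node]
            cap[a] -= 1
            cap[a ^ 1] += 1
            node = to[a ^ 1]             # tail of arc a
        flow += 1

def multicast_capacity(edges, s, sinks):
    """n_min: the minimum over the sinks of the unicast capacity n_T."""
    return min(edge_disjoint_paths(edges, s, t) for t in sinks)

# Illustrative butterfly network (node names are hypothetical).
BUTTERFLY = [("s", "a"), ("s", "b"), ("a", "t1"), ("a", "c"), ("b", "c"),
             ("b", "t2"), ("c", "d"), ("d", "t1"), ("d", "t2")]
n_min = multicast_capacity(BUTTERFLY, "s", ["t1", "t2"])
```

For this butterfly each sink has two edge-disjoint paths from the source, so `n_min` evaluates to 2.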
A. Network code for unit-delay, memory-free networks

We follow [2] in describing the network code. For each node
$v \in \mathcal{V}$, let the set of all incoming edges be denoted by $\Gamma_I(v)$.
Then $|\Gamma_I(v)| = \delta_I(v)$ is the in-degree of $v$. Similarly, the set
of all outgoing edges is denoted by $\Gamma_O(v)$, and the out-degree
of the node $v$ is given by $|\Gamma_O(v)| = \delta_O(v)$.

For any $e \in \mathcal{E}$ and $v \in \mathcal{V}$, let $head(e) = v$ if $v$ is such
that $e \in \Gamma_I(v)$. Similarly, let $tail(e) = v$ if $v$ is such that
$e \in \Gamma_O(v)$. We will assume an ancestral ordering on $\mathcal{V}$ and $\mathcal{E}$
of the acyclic graph of the unit-delay, memory-free network.

The network code can be defined by the local kernel
matrices of size $\delta_I(v) \times \delta_O(v)$ for each node $v \in \mathcal{V}$ with
entries from $\mathbb{F}_q$. The global encoding kernels for each edge
can be recursively calculated from these local kernels.

The network transfer matrix, which governs the input-output
relationship in the network, is defined as given in [2] for an
$n$-dimensional ($n \leq n_{\min}$) network code. Towards this end,
the matrices $A$, $K$, and $B^T$ (for every sink $T \in \mathcal{T}$) are defined
as follows.

The entries of the $n \times |\mathcal{E}|$ matrix $A$ are defined as

$$A_{i,j} = \begin{cases} \alpha_{i,e_j} & \text{if } e_j \in \Gamma_O(s) \\ 0 & \text{otherwise} \end{cases}$$

where $\alpha_{i,e_j} \in \mathbb{F}_q$ is the local encoding kernel coefficient at
the source coupling input $i$ with edge $e_j \in \Gamma_O(s)$.

The $(i, j)$th entry of the $|\mathcal{E}| \times |\mathcal{E}|$ matrix $K$ is $K_{e_i,e_j} \in \mathbb{F}_q$,
which is the local kernel coefficient between $e_i$ and $e_j$ at the
node $head(e_i) = tail(e_j)$ (if such a node exists), and zero if
$head(e_i) \neq tail(e_j)$.

For every sink $T \in \mathcal{T}$, the entries of the $|\mathcal{E}| \times n$ matrix $B^T$
are defined as

$$B^T_{i,j} = \begin{cases} \epsilon_{e_i,j} & \text{if } e_i \in \Gamma_I(T) \\ 0 & \text{otherwise} \end{cases}$$

where all $\epsilon_{e_i,j} \in \mathbb{F}_q$.

For unit-delay, memory-free networks, we have

$$F(z) := (I - zK)^{-1}$$

where $I$ is the $|\mathcal{E}| \times |\mathcal{E}|$ identity matrix. Now we have the
following definition.

Definition 1 ([2]): The network transfer matrix, $M_T(z)$,
corresponding to a sink node $T \in \mathcal{T}$ for an $n$-dimensional
network code, is a full rank (over the field of rationals $\mathbb{F}_q(z)$)
$n \times n$ matrix defined as

$$M_T(z) := A F(z) B^T = A F_T(z).$$

With an $n$-dimensional network code, the input and the
output of the network are $n$-tuples of elements from $\mathbb{F}_q[[z]]$,
the formal power series ring over $\mathbb{F}_q$. Definition 1 implies
that if $x(z) \in \mathbb{F}_q^n[[z]]$ is the input to the unit-delay,
memory-free network, then at any particular sink $T \in \mathcal{T}$, we have the
output, $y(z) \in \mathbb{F}_q^n[[z]]$, to be

$$y(z) = x(z) M_T(z).$$
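
As an illustration of Definition 1, the sketch below evaluates $M_T(z) = A (I - zK)^{-1} B^T$ for a small example. It relies on the fact that, for an acyclic network with edges in ancestral order, $K$ is nilpotent, so $(I - zK)^{-1} = \sum_{k \geq 0} z^k K^k$ is a finite sum and the coefficient of $z^k$ in $M_T(z)$ is $A K^k B^T$. The butterfly edge indexing, the all-ones coefficients, and the helper names are assumptions made for this example (with $q$ prime so that integers mod $q$ form $\mathbb{F}_q$); they are not taken from the paper's figures.

```python
def mat_mul(X, Y, q):
    """Product of two matrices (lists of rows) with entries reduced mod q."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % q
             for j in range(len(Y[0]))] for i in range(len(X))]

def transfer_matrix(A, K, B, q):
    """Return {k: coefficient matrix of z^k} for M_T(z) = A (I - zK)^{-1} B."""
    coeffs, AKk = {}, A                 # AKk holds A K^k
    for k in range(len(K) + 1):         # K^k = 0 once k reaches the number of edges
        term = mat_mul(AKk, B, q)
        if any(any(row) for row in term):
            coeffs[k] = term
        AKk = mat_mul(AKk, K, q)
    return coeffs

def indicator(rows, cols, ones):
    """Matrix with 1 at the listed (row, col) positions and 0 elsewhere."""
    M = [[0] * cols for _ in range(rows)]
    for i, j in ones:
        M[i][j] = 1
    return M

# Hypothetical unit-delay butterfly over F_2, edges indexed in ancestral order:
# 0: s->a, 1: s->b, 2: a->t1, 3: a->c, 4: b->c, 5: b->t2, 6: c->d, 7: d->t1, 8: d->t2
A = indicator(2, 9, [(0, 0), (1, 1)])                       # inputs onto source edges
K = indicator(9, 9, [(0, 2), (0, 3), (1, 4), (1, 5),
                     (3, 6), (4, 6), (6, 7), (6, 8)])       # local kernels
B_T1 = indicator(9, 2, [(2, 0), (7, 1)])                    # edges read by sink t1
M_T1 = transfer_matrix(A, K, B_T1, q=2)
```

Here `M_T1` has nonzero coefficient matrices only at $z^1$ and $z^3$, i.e. $M_{T_1}(z)$ has rows $(z, z^3)$ and $(0, z^3)$ in this indexing. The two outputs of the sink therefore carry symbols of different generations, which is the inter-generation situation the paper sets out to avoid.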



B. Network code for networks with delay and memory 

We define the instantaneous counterpart of a unit-delay 
network as follows. 

Definition 2: Given a unit-delay network $\mathcal{G}(\mathcal{V}, \mathcal{E})$, the
network obtained from $\mathcal{G}$ (having the same node set $\mathcal{V}$ and the
same edge set $\mathcal{E}$) by removing the delays associated with the
edges is defined as the instantaneous counterpart of $\mathcal{G}(\mathcal{V}, \mathcal{E})$.

Example 1: Fig. 1 illustrates an example. A modified butterfly
unit-delay network (top) and its instantaneous counterpart
(bottom) are shown. The global kernels of the incoming
edges to the sinks $T_1$ and $T_2$ corresponding to a 2-dimensional
network code are indicated for both networks.




Fig. 1. The figure corresponding to Example 1 (A unit-delay network and 
its instantaneous counterpart). 

Let $\mathcal{G}_m(\mathcal{V}, \mathcal{E})$ be a single-source, acyclic network with every
edge of the network having some delay (a positive integer) and
with memory elements at the nodes available for usage. If none
of the memory elements at the nodes are used, then we can
model $\mathcal{G}_m$ as a unit-delay, memory-free network $\mathcal{G}_u$. Let $\mathcal{G}_{inst}$
be the instantaneous counterpart of $\mathcal{G}_u$. The following lemma
ensures the equivalence of a network code between $\mathcal{G}_{inst}$ and
$\mathcal{G}_u$.

Lemma 1 ([4]): Let $\mathcal{G}'(\mathcal{V}, \mathcal{E})$ be a single-source acyclic,
unit-delay, memory-free network, and $\mathcal{G}'_{inst}$ be the instantaneous
counterpart of $\mathcal{G}'$. Let $\mathcal{N}$ be the set of all $\delta_I(v) \times \delta_O(v)$
matrices $\forall v \in \mathcal{V}$, i.e., the set of local encoding kernel matrices
at each node, describing an $m$-dimensional network code (over
$\mathbb{F}_q$) for $\mathcal{G}'_{inst}$ ($m \leq$ min-cut of the source-sink connections in
$\mathcal{G}'_{inst}$). Then the network code described by $\mathcal{N}$ continues to
be an $m$-dimensional network code (over $\mathbb{F}_q(z)$) for the
unit-delay, memory-free network $\mathcal{G}'$.

If the nodes use memory elements such that inter-generation 
network coding is prevented at any particular node of the 
network, then this leads to single-generation network coding 
in the network. 

In Section V we give an algorithm which uses memory
elements at the nodes to achieve single-generation network
coding, i.e., the network transfer matrix $M_T(z)$ of every sink
$T \in \mathcal{T}$ in $\mathcal{G}_m$ becomes

$$M_T(z) = z^{L_T} M_T \tag{1}$$

where $L_T$ is some positive integer and $M_T$ is the network
transfer matrix of the sink $T$ in $\mathcal{G}_{inst}$. Clearly, if $M_T$ is full
rank (over $\mathbb{F}_q$), so is $M_T(z)$ (over $\mathbb{F}_q(z)$).

III. Memory additions at a node 

For the source node $s$, let $\tilde{\Gamma}_I(s)$ denote the set of $n$ virtual
incoming edges which denote the $n$ inputs. The global kernels
of these edges are therefore the columns of an $n \times n$ identity
matrix over $\mathbb{F}_q$, the field over which the network code is
defined. For every non-source node $v \in \mathcal{V}$, let $\tilde{\Gamma}_I(v) = \phi$.
For a sink $T \in \mathcal{T}$, let $\tilde{\Gamma}_O(T)$ denote $n$ virtual outgoing edges
denoting the $n$ outputs at sink $T$. The global kernels of these
edges are the columns of the network transfer matrix $M_T(z)$.
For every non-sink node $v \in \mathcal{V}$, let $\tilde{\Gamma}_O(v) = \phi$. We then
define the set $\tilde{\mathcal{E}}$ as

$$\tilde{\mathcal{E}} := \mathcal{E} \cup \tilde{\Gamma}_I(s) \cup \Big( \bigcup_{T \in \mathcal{T}} \tilde{\Gamma}_O(T) \Big).$$

The ancestral ordering on $\mathcal{E}$ can then be extended to an
ancestral ordering on $\tilde{\mathcal{E}}$.

For any $e_i, e_j \in \tilde{\mathcal{E}}$ such that $head(e_i) = tail(e_j) = v \in \mathcal{V}$,
with memory being used at $v$, the local kernel $A_{e_i,e_j}$ (the
kernel coefficient between $e_i \in \tilde{\Gamma}_I(s)$ and $e_j \in \Gamma_O(s)$ with
$s = v$), $K_{e_i,e_j}$, or $B^v_{e_i,e_j}$ (the kernel coefficient between
$e_i \in \Gamma_I(v)$ and $e_j \in \tilde{\Gamma}_O(v)$ for some sink node $v$) can have
elements from $\mathbb{F}_q(z)$. We show in Section V that using the memory
elements at the nodes according to Subsection III-A and
Subsection III-B is sufficient to guarantee single-generation
network coding at each node and therefore in the given
network.

A. Adding memory at a node for a pair of an incoming and 
an outgoing edge 

For any $e_i, e_j \in \tilde{\mathcal{E}}$ such that $head(e_i) = tail(e_j) = v \in \mathcal{V}$,
we define $M_{e_i,e_j}$ as the number of memory elements utilized
at the node $v$ to delay the symbols coming from the incoming
edge $e_i$ (before any network coding is performed at node $v$
on the symbols from $e_i$) such that the local kernel between $e_i$
and $e_j$ is modified in one of the following ways

$$A_{e_i,e_j} \mapsto z^{M_{e_i,e_j}} A_{e_i,e_j} \quad \text{if } e_i \in \tilde{\Gamma}_I(s),\ e_j \in \mathcal{E} \tag{2}$$

$$K_{e_i,e_j} \mapsto z^{M_{e_i,e_j}} K_{e_i,e_j} \quad \text{if } e_i, e_j \in \mathcal{E} \tag{3}$$

$$B^v_{e_i,e_j} \mapsto z^{M_{e_i,e_j}} B^v_{e_i,e_j} \quad \text{if } e_i \in \mathcal{E},\ e_j \in \tilde{\Gamma}_O(v) \tag{4}$$

while none of the other local kernels are changed. The matrix
$F(z) = (I - zK)^{-1}$ is also correspondingly modified.



B. Adding memory at a node for an outgoing edge 

For $e_j \in \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$, we define $M_{e_j,tail(e_j)}$ as the
number of memory elements added at node $v$ to delay the
symbols going into the edge $e_j$ after performing network
coding at $v$. In such a case, the elements of the matrix $K$
(or of the matrix $A$ or $B^v$) are modified according to the
following rule.

$$A_{e_i,e_j} \mapsto z^{M_{e_j,tail(e_j)}} A_{e_i,e_j} \quad \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } v = s \tag{5}$$

$$K_{e_i,e_j} \mapsto z^{M_{e_j,tail(e_j)}} K_{e_i,e_j} \quad \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } v \neq s,\ e_j \in \Gamma_O(v) \tag{6}$$

$$B^v_{e_i,e_j} \mapsto z^{M_{e_j,tail(e_j)}} B^v_{e_i,e_j} \quad \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } e_j \in \tilde{\Gamma}_O(v) \tag{7}$$

where the set $\Gamma_{I,e_j}(v) \subset \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$ is defined in (8).
The elements of the matrix $F(z)$ are also correspondingly modified.

Example 2: Fig. 2 illustrates an example of the memory
additions at a node. The memory elements indicated inside
the box labeled 'A' are added at the node for the pair of edges
$e_i$ and $e_j$, thereby delaying the symbols on $e_i$ before network
coding at the node, i.e., $M_{e_i,e_j} = 2$. Similarly, the memory
element indicated by 'C' is added for the pair of edges $e_i$ and
$e_k$, i.e., $M_{e_i,e_k} = 1$. The memory element indicated by 'B'
is added for the outgoing edge $e_j$ after network coding, i.e.,
$M_{e_j,tail(e_j)} = 1$.

Fig. 2. The figure corresponding to Example 2 (Adding memory at a node). 
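
The two kinds of memory addition can be mimicked with a small bookkeeping sketch. It assumes a local kernel entry is stored as a pair (coefficient, delay exponent), so that multiplying by $z^M$ is just adding $M$ to the exponent; the edge names, the extra incoming edge `el`, and the unit coefficients are hypothetical and only loosely follow Example 2.

```python
def add_pair_memory(kernel, e_in, e_out, m):
    """Rules (2)-(4): delay the symbols of e_in by m before coding towards e_out."""
    coeff, delay = kernel[(e_in, e_out)]
    kernel[(e_in, e_out)] = (coeff, delay + m)

def add_outgoing_memory(kernel, e_out, m):
    """Rules (5)-(7): delay the coded symbol entering e_out by m.

    Every nonzero kernel entry feeding e_out picks up the factor z^m.
    """
    for (e_in, e), (coeff, delay) in list(kernel.items()):
        if e == e_out and coeff != 0:
            kernel[(e_in, e)] = (coeff, delay + m)

# Hypothetical node: incoming edges ei, el; outgoing edges ej, ek.
kernel = {("ei", "ej"): (1, 0), ("ei", "ek"): (1, 0), ("el", "ej"): (1, 0)}
add_pair_memory(kernel, "ei", "ej", 2)   # memory in box 'A'
add_pair_memory(kernel, "ei", "ek", 1)   # memory in box 'C'
add_outgoing_memory(kernel, "ej", 1)     # memory in box 'B'
```

After these calls the entry for the pair (`ei`, `ej`) carries a delay of 3, while the entries for (`ei`, `ek`) and (`el`, `ej`) each carry a delay of 1. The per-pair memory affects a single kernel entry, whereas the outgoing-edge memory affects every entry feeding that edge.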

IV. Memory reduction and distribution techniques

In this section, we look at techniques to reduce the memory 
used at the nodes of the network and the overall memory used 
in the network and also to obtain a fairly uniform memory 
usage distribution throughout the network. 

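We define the maximum number of memory elements added
to delay the symbols coming from an edge $e_i \in \tilde{\mathcal{E}}$ into node
$head(e_i) = v$ as

$$M_{e_i,head(e_i),\max} := \max_{e_j \in \Gamma_{O,e_i}(v)} M_{e_i,e_j} \tag{9}$$

where the sets $\Gamma_{I,e_j}(v)$ and $\Gamma_{O,e_i}(v)$ are defined as

$$\Gamma_{I,e_j}(v) := \{ e_i \in \Gamma_I(v) \mid K_{e_i,e_j} \neq 0 \} \cup \{ e_i \in \tilde{\Gamma}_I(v) \mid A_{e_i,e_j} \neq 0 \} \tag{8}$$

$$\Gamma_{O,e_i}(v) := \{ e_j \in \Gamma_O(v) \mid K_{e_i,e_j} \neq 0 \} \cup \{ e_j \in \tilde{\Gamma}_O(v) \mid B^v_{e_i,e_j} \neq 0 \} \tag{10}$$

We denote the total number of memory elements used at node $v$
by $M_v$; it is obtained by summing over the edges
$e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$ and $e_j \in \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$.
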

A. Memory reduction in a single node 

Consider a node $v \in \mathcal{V}$ in which memory elements have
been added to delay symbols coming from an edge
$e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$.

Then, retaining the $M_{e_i,head(e_i),\max}$ (as defined in (9))
memory elements, all other memory elements placed on $e_i$
can be removed without any change in any local or global
kernels by tapping symbols from the $M_{e_i,head(e_i),\max}$ memory
elements wherever necessary. Doing this for every incoming
edge of $v$ is equivalent to obtaining a minimal encoder (one
with the minimum number of memory elements) of the transfer
function (input-output relationship) at node $v$.

Example 3: Fig. 3 illustrates a particular example of such
a reduction. The figure on the top (all $a_i \in \mathbb{F}_q$) represents
a node $v$ before memory reduction with $M_v = 3$, while the
figure on the bottom is the same node after memory reduction
with $M_v = 2$.
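
The counting behind this reduction can be sketched as follows, under the assumption that before reduction every (incoming, outgoing) pair has its own chain of $M_{e_i,e_j}$ memory elements, and that after reduction each incoming edge keeps a single chain of length $M_{e_i,head(e_i),\max}$ which is tapped at the required depths. Memory placed on outgoing edges is ignored here. The delay values below are hypothetical and were chosen only to reproduce the counts 3 and 2 quoted in Example 3; they are not read from Fig. 3.

```python
def node_memory(pair_delays):
    """pair_delays maps (e_in, e_out) -> M_{e_in,e_out}.

    Returns (memory before reduction, memory after reduction).
    """
    before = sum(pair_delays.values())          # one delay chain per pair
    per_edge_max = {}
    for (e_in, _), m in pair_delays.items():    # (9): longest chain per incoming edge
        per_edge_max[e_in] = max(per_edge_max.get(e_in, 0), m)
    after = sum(per_edge_max.values())          # one shared, tapped chain per edge
    return before, after

# Hypothetical node: e1 is delayed by 1 towards e3 and by 2 towards e4.
before, after = node_memory({("e1", "e3"): 1, ("e1", "e4"): 2})
```

With these values `before` is 3 and `after` is 2: the one-step delay needed for `e3` is tapped from the first element of the two-element chain kept for `e4`.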

B. Memory reduction between nodes 

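For a set of edges $\mathcal{E}' \subset \mathcal{E}$, let $\mathcal{V}_{\mathcal{E}'}$ be the set of all nodes
defined as follows

$$\mathcal{V}_{\mathcal{E}'} = \{ head(e_j) \mid e_j \in \mathcal{E}' \} \tag{11}$$

We now define $M_{e_i,head(e_i),\min}$ and $M_{\mathcal{E}'}$ as follows.

$$M_{e_i,head(e_i),\min} := \min_{e_j \in \Gamma_{O,e_i}(v)} M_{e_i,e_j} \tag{12}$$

$$M_{\mathcal{E}'} := \min_{e_i \in \mathcal{E}'} M_{e_i,head(e_i),\min} \tag{13}$$

where $\Gamma_{O,e_i}(v)$ is as defined in (10).

For a node $v \in \mathcal{V}$, we define the set of adjacent nodes of $v$
as the set of nodes

$$\mathcal{E}_v := \{ v' \mid v' = head(e_j),\ e_j \in \Gamma_O(v) \}.$$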

1) Memory reduction between adjacent nodes: For a node
$v \in \mathcal{V}$, and for some $\Gamma'_O(v) \subset \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$, let
$\Gamma'_I(v) \subset \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$ be defined as

$$\Gamma'_I(v) = \bigcup_{e_j \in \Gamma'_O(v)} \Gamma_{I,e_j}(v),$$

where $\Gamma_{I,e_j}(v)$ is as in (8), i.e., the global kernels of the edges
$e_j \in \Gamma'_O(v)$ are linear combinations of the global kernels of
the edges in $\Gamma'_I(v)$ only and none else. Also let $M_{\Gamma'_O(v)}$ and
the set $\mathcal{V}_{\Gamma'_O(v)} \subseteq \mathcal{E}_v$ of nodes be defined for the set of edges
$\Gamma'_O(v)$ as in (13) and (11) respectively.

Fig. 3. The figure corresponding to Example 3 (Memory reduction at a
node).

We define the term $M_{e_i,\Gamma'_O(v)}$ as

$$M_{e_i,\Gamma'_O(v)} = \max \{ 0,\ M_{\Gamma'_O(v)} - M_{e_i,head(e_i),\max} \} \tag{14}$$

Then, if the condition

$$\sum_{e_i \in \Gamma'_I(v)} M_{e_i,\Gamma'_O(v)} \leq M_{\Gamma'_O(v)} |\Gamma'_O(v)| \tag{15}$$

is satisfied, all of the $|\Gamma'_O(v)| M_{\Gamma'_O(v)}$ memory elements used at
the nodes $\mathcal{V}_{\Gamma'_O(v)}$ (to delay symbols coming from the edges
$e_j \in \Gamma'_O(v)$) can be 'absorbed' into node $v$ by removing all these
memory elements and adding $M_{e_i,\Gamma'_O(v)}$ memory elements at node $v$
for every $e_i \in \Gamma'_I(v)$ (and thereby used for delaying the symbols
coming from every $e_i \in \Gamma'_I(v)$), without using any additional memory
and without changing the global kernels of any outgoing edge
of any node in $\mathcal{V}_{\Gamma'_O(v)}$.

This technique of 'absorption' of the memory elements from
a set of nodes which are the 'heads' of the outgoing edges from
a node $v$, to the node $v$ itself, is beneficial in terms of reducing
the overall memory usage of the network (to achieve
single-generation network coding) if the condition (15) is satisfied as
a strict inequality.

Example 4: Fig. 4 illustrates an example of memory
reduction between multiple nodes ($v_1$, $v_2$, $v_3$, and $v_4$ here) of a
network. Here $M_{\Gamma'_O(v)} = 1$, $|\Gamma'_O(v)| = 3$, and
$M_{e_1,\Gamma'_O(v)} = M_{e_2,\Gamma'_O(v)} = 1$. Therefore, three memory elements
at nodes $v_2$, $v_3$, and $v_4$ are 'absorbed' into two memory elements at
node $v_1$. The boxes indicate the use of memory elements and
the node to which the memory elements are attached.




Fig. 4. The figure corresponding to Example 4 (Memory reduction between 
adjacent nodes). 
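
The bookkeeping in (14) and (15) can be written out as a short sketch. The function name and argument layout are assumptions made for illustration. The inputs are $M_{\Gamma'_O(v)}$, the size $|\Gamma'_O(v)|$, and, for each incoming edge in $\Gamma'_I(v)$, the value $M_{e_i,head(e_i),\max}$ already present at $v$. The example call assumes, as the numbers quoted in Example 4 suggest, that the two incoming edges of $v_1$ carry no memory beforehand.

```python
def absorption(m_out, n_out, in_max):
    """Evaluate (14) and (15) for absorbing memory into node v.

    m_out  : M_{Gamma'_O(v)}, memory removable at the head of every outgoing edge
    n_out  : |Gamma'_O(v)|
    in_max : dict e_i -> M_{e_i,head(e_i),max} currently used at v
    Returns (memory to add per incoming edge, condition (15) holds?, net saving).
    """
    needed = {e: max(0, m_out - m) for e, m in in_max.items()}   # (14)
    removed = m_out * n_out
    added = sum(needed.values())
    return needed, added <= removed, removed - added             # (15)

# Numbers of Example 4, with hypothetical edge names e1 and e2.
needed, ok, saving = absorption(m_out=1, n_out=3, in_max={"e1": 0, "e2": 0})
```

For these values one element is added on each of the two incoming edges, condition (15) holds strictly (2 < 3), and the net saving is one memory element, in line with three elements being absorbed into two.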

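2) Memory reduction between nodes not necessarily adjacent:
For $\mathcal{E}_I, \mathcal{E}_O \subset \mathcal{E}$ being two sets of edges, we say that
they form a pair $[\mathcal{E}_I, \mathcal{E}_O]$ if

$$\mathcal{E}_I = \bigcup_{e_j \in \mathcal{E}_O} \Gamma_{I,e_j}(tail(e_j))$$

and

$$\mathcal{E}_O = \bigcup_{e_i \in \mathcal{E}_I} \Gamma_{O,e_i}(head(e_i)).$$

We say that the sets $\mathcal{E}_I, \mathcal{E}_O$ form a pair $[\mathcal{E}_I, \mathcal{E}_O)$ if

$$\mathcal{E}_I = \bigcup_{e_j \in \mathcal{E}_O} \Gamma_{I,e_j}(tail(e_j))$$

and

$$\mathcal{E}_O \subset \bigcup_{e_i \in \mathcal{E}_I} \Gamma_{O,e_i}(head(e_i)).$$

For a node $v$, we define the set $P_v$ as follows

$$P_v := \{ [\Gamma_{I_i}(v), \Gamma_{O_i}(v)] \mid 1 \leq i \leq \kappa_v \}$$

such that the following conditions are satisfied

$$\Gamma_{I_i}(v) \cap \Gamma_{I_j}(v) = \phi, \quad \forall\ 1 \leq i, j \leq \kappa_v,\ i \neq j \tag{16}$$

$$\Gamma_{O_i}(v) \cap \Gamma_{O_j}(v) = \phi, \quad \forall\ 1 \leq i, j \leq \kappa_v,\ i \neq j \tag{17}$$

where $\kappa_v$ is the maximum number of sets satisfying conditions
(16) and (17). Algorithm 1 obtains the set $P_v$ for some node $v$.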

Example 5: Fig. 5 illustrates a node $v$ with the local kernel
matrix over some field $\mathbb{F}_q$. For this node, the set $P_v$ is given
as

$$P_v = \{ [\Gamma_{I_1}(v), \Gamma_{O_1}(v)],\ [\Gamma_{I_2}(v), \Gamma_{O_2}(v)] \}$$

where

$$\Gamma_{I_1}(v) = \{e_1, e_2, e_3\}, \quad \Gamma_{I_2}(v) = \{e_4\},$$

$$\Gamma_{O_1}(v) = \{e_5\}, \quad \Gamma_{O_2}(v) = \{e_6, e_7, e_8\}.$$

Fig. 5. The figure corresponding to Example 5 which gives the set $P_v$ of
the node $v$.

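For a pair of edge-sets $[\Gamma_{I_i}(v), \Gamma_{O_i}(v)] \in P_v$, we define
$S_i(v)$, a sequence of pairs of edge-sets, as

$$S_i(v) := [\mathcal{E}_{I_m}, \mathcal{E}_{I_{m-1}}),\ [\mathcal{E}_{I_{m-1}}, \mathcal{E}_{I_{m-2}}],\ \ldots,\ [\mathcal{E}_{I_2}, \mathcal{E}_{I_1}],\ [\mathcal{E}_{I_1}, \mathcal{E}_{O_1}] \tag{18}$$

where $[\mathcal{E}_{I_1}, \mathcal{E}_{O_1}] = [\Gamma_{I_i}(v), \Gamma_{O_i}(v)]$, and $m$ is the maximum
length of the sequence that can be obtained as in
(18) for the edge-set pair $[\Gamma_{I_i}(v), \Gamma_{O_i}(v)]$.
Let $k$ be an integer such that

$$|\mathcal{E}_{I_k}| = \min_{1 \leq j \leq m} |\mathcal{E}_{I_j}|.$$

For the set $\Gamma_{O_i}(v)$, let $M_{\Gamma_{O_i}(v)}$ be defined as in (13), and
the set of nodes $\mathcal{V}_{\Gamma_{O_i}(v)}$ be defined as in (11). Let the set
of nodes $\mathcal{V}_{\mathcal{E}_{I_k}}$ be defined as in (11) for the set $\mathcal{E}_{I_k}$. Also, let
$M_{e_{i_k},\Gamma_{O_i}(v)}$ be defined as in (14) for the set $\Gamma_{O_i}(v)$ and for
an edge $e_{i_k} \in \mathcal{E}_{I_k}$. As in the memory reduction procedure of
adjacent nodes, if

$$\sum_{e_{i_k} \in \mathcal{E}_{I_k}} M_{e_{i_k},\Gamma_{O_i}(v)} \leq M_{\Gamma_{O_i}(v)} |\Gamma_{O_i}(v)| \tag{19}$$

then the $|\Gamma_{O_i}(v)| M_{\Gamma_{O_i}(v)}$ memory elements used at the nodes
$\mathcal{V}_{\Gamma_{O_i}(v)}$ (to delay symbols coming from the edges
$e_j \in \Gamma_{O_i}(v)$) can be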
Input: A node $v \in \mathcal{V}$ with the edge sets $\Gamma_I(v) \cup \tilde{\Gamma}_I(v)$ and $\Gamma_O(v) \cup \tilde{\Gamma}_O(v)$.
Output: The set $P_v$ for the node $v$.

Let $i = 1$, $Out(v) = \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$, $P_v = \phi$.
repeat
    Let $\Gamma_{I_i}(v) = \Gamma_{O_i}(v) = \phi$.
    For some $e_j \in Out(v)$, let $\Gamma_{I_i}(v) = \Gamma_{I,e_j}(v)$.
    repeat
        Let $\Gamma_{O_i}(v) = \bigcup_{e_i \in \Gamma_{I_i}(v)} \Gamma_{O,e_i}(v)$.
        Let $\Gamma_{I_i}(v) = \bigcup_{e_j \in \Gamma_{O_i}(v)} \Gamma_{I,e_j}(v)$.
    until the sets $\Gamma_{I_i}(v)$ and $\Gamma_{O_i}(v)$ remain unchanged for 2 consecutive iterations;
    Let $P_v = P_v \cup \{ [\Gamma_{I_i}(v), \Gamma_{O_i}(v)] \}$.
    Let $Out(v) = Out(v) \setminus \Gamma_{O_i}(v)$ and $i = i + 1$.
until $Out(v) = \phi$;

Algorithm 1. Algorithm to obtain the set $P_v$ for a node $v$.



removed without changing the global kernels of the edges of
$\Gamma_O(v')$, $\forall v' \in \mathcal{V}_{\Gamma_{O_i}(v)}$, by adding $M_{e_{i_k},\Gamma_{O_i}(v)}$ memory
elements for each edge $e_{i_k} \in \mathcal{E}_{I_k}$ at the node $head(e_{i_k}) \in \mathcal{V}_{\mathcal{E}_{I_k}}$.
This technique will save memory if the condition (19) is
satisfied as a strict inequality.
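
The closure computed by Algorithm 1 can be sketched in a few lines. This is an illustrative reading of the algorithm rather than the authors' implementation. It assumes the node is described by the support of its local kernel, i.e. the set of (incoming, outgoing) edge pairs with a nonzero coefficient, and it seeds each output set with the chosen outgoing edge so that the loop also terminates for an outgoing edge with no contributing input (a guard that is not part of Algorithm 1). The kernel support below is a hypothetical one consistent with the sets listed in Example 5.

```python
def edge_set_pairs(kernel_support, out_edges):
    """Partition a node's edges into the pairs [Gamma_I_i(v), Gamma_O_i(v)] of P_v.

    kernel_support : set of (e_in, e_out) pairs with a nonzero kernel coefficient
    out_edges      : list of the node's outgoing edges (real and virtual)
    """
    def gamma_in(e_out):    # Gamma_{I,e_out}(v), as in (8)
        return {i for (i, o) in kernel_support if o == e_out}

    def gamma_out(e_in):    # Gamma_{O,e_in}(v), as in (10)
        return {o for (i, o) in kernel_support if i == e_in}

    remaining, pairs = list(out_edges), []
    while remaining:
        seed = remaining[0]
        ins, outs = gamma_in(seed), set()
        while True:         # grow both sets until neither changes
            new_outs = {seed}.union(*(gamma_out(e) for e in ins))
            new_ins = set().union(*(gamma_in(e) for e in new_outs))
            if new_outs == outs and new_ins == ins:
                break
            ins, outs = new_ins, new_outs
        pairs.append((ins, outs))
        remaining = [e for e in remaining if e not in outs]
    return pairs

# Hypothetical kernel support matching Example 5.
support = {("e1", "e5"), ("e2", "e5"), ("e3", "e5"),
           ("e4", "e6"), ("e4", "e7"), ("e4", "e8")}
P_v = edge_set_pairs(support, ["e5", "e6", "e7", "e8"])
```

For this support the sketch returns the two pairs of Example 5: ({e1, e2, e3}, {e5}) and ({e4}, {e6, e7, e8}). Viewed as a bipartite graph between incoming and outgoing edges, each pair is one connected component.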

Example 6: Fig. 6 illustrates an example of the memory
reduction procedure between non-adjacent nodes. Let
$K_{e_i,e_j} \neq 0$, $\forall\ 9 \leq i \leq 12$, $13 \leq j \leq 15$. In the
example, for the node $v_3$, the set $P_{v_3}$ and the sequence $S_1(v_3)$
corresponding to the only element of $P_{v_3}$ are given by (20)
and (21).

Now, we have $M_{\Gamma_{O_1}(v_3)} = 1$, $|\Gamma_{O_1}(v_3)| = 3$, $\mathcal{E}_{I_k} = \{e_1\}$,
and $M_{e_1,\Gamma_{O_1}(v_3)} = 1$. Therefore, the 3 memory elements used for the
edges in $\Gamma_{O_1}(v_3)$ at the nodes $v_4$, $v_5$, and $v_6$ are 'absorbed'
into a single memory element used at node $v_1$ for edge $e_1$,
thus reducing the memory usage by 2.

Remark 1: The memory reduction procedures of Subsub- 
section IV-B1, and Subsubsection IV-B2 can sometimes result 
in exactly the same memory reduction event. However, there 
could be instances in which only one of the procedures can 
achieve memory reduction. 

For example, the memory reduction procedure of Subsubsection
IV-B1 cannot reduce memory at node $v_3$ in the situation
shown in Example 6 because for any $\Gamma'_O(v) \subset \Gamma_O(v)$,
$|\Gamma'_I(v)| > 3 \geq |\Gamma'_O(v)|$, since $\Gamma'_I(v) = \Gamma_I(v)$. However, the
memory reduction procedure of Subsubsection IV-B2 does
work, as shown in Fig. 6.

Similarly, in some cases, at a node, the procedure of 
Subsubsection IV-B1 can be used to reduce memory usage, 
while Subsubsection IV-B2 cannot be applied. This is because 
of the fact that, at any node, the procedure of Subsubsection 
IV-B2 takes into account only those sets of the form P v , while 
the procedure of Subsubsection IV-B1 takes into account all 
possible incoming and outgoing edges. Such a case is seen in 
Example 7. 



Example 7: Fig. 7 shows the node $v$ of Fig. 5 (Example 5)
in a particular configuration. The memory reduction procedure
of Subsubsection IV-B2 cannot be applied for the set $\Gamma_{O_2}(v)$
because $M_{\Gamma_{O_2}(v)} = 0$.

But $M_{\{e_6,e_7\}} = 1$, and therefore 2 memory elements at nodes
$v_1$ and $v_2$ can be absorbed into a single memory element at
node $v$, thereby facilitating memory reduction according to
Subsubsection IV-B1.

Fig. 7. The figure corresponding to Example 7. The box with the incoming
edges $e_1$, $e_2$, $e_3$, and $e_4$ represents the node $v$ of Fig. 5 (Example 5).
$$P_{v_3} = \{ [\Gamma_{I_1}(v_3) = \{e_9, e_{10}, e_{11}, e_{12}\},\ \Gamma_{O_1}(v_3) = \{e_{13}, e_{14}, e_{15}\}] \}. \tag{20}$$

$$S_1(v_3) = [\{e_1\}, \{e_2, e_3\}),\ [\{e_2, e_3\}, \{e_5, e_6, e_7, e_8\}],\ [\{e_5, e_6, e_7, e_8\}, \Gamma_{I_1}(v_3)],\ [\Gamma_{I_1}(v_3), \Gamma_{O_1}(v_3)] \tag{21}$$

Fig. 6. The figure corresponding to Example 6 (Memory reduction between non-adjacent nodes).


C. Memory distribution 

The following technique can be used to distribute memory
elements throughout the network in a somewhat uniform way.
Suppose there exists a node $v \in \mathcal{V}$ such that for some
$e_j \in \Gamma_O(v)$ with $v' = head(e_j)$ and for some integer
$m \leq M_{e_j,head(e_j),\min}$, we have

$$M_v + m < M_{v'} \tag{22}$$

Then the $m$ memory elements at node $v'$ used to delay symbols
coming from edge $e_j$ can be 'absorbed' into node $v$ (thereby
using them to delay symbols going into edge $e_j$) without
changing the global kernels of any edge in $\Gamma_O(v')$.

This technique reduces the number of memory elements
used at node $v'$ for delaying its incoming symbols while
increasing the number ($M_{e_j,tail(e_j)}$) of memory elements used
at node $v$ for delaying its outgoing symbols.

Example 8: Fig. 8 illustrates an example of memory
distribution between two nodes $v_1$ and $v_2$. In the figure on the
top, $m = 1$, $M_{v_1} = 0$, and $M_{v_2} = 3$. Therefore one memory
element from $v_2$ (used to delay symbols coming from $e_j$ into
$v_2$) can be 'absorbed' into node $v_1$ (and thereby used to delay
symbols going into $e_j$ from $v_1$). The boxes indicate the node
to which the memory elements are attached. After distribution,
$M_{v_1} = 1$ and $M_{v_2} = 2$.
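
The distribution step can be sketched as a single check of condition (22). The function name and its arguments are illustrative. It assumes that moving $m$ elements lowers the count at $v'$ by exactly $m$ and raises the count at $v$ by $m$, and the value used for $M_{e_j,head(e_j),\min}$ in the example call is an assumption, since Example 8 does not state it.

```python
def try_distribute(M_v, M_v_next, m, edge_min):
    """Move m memory elements from v' = head(e_j) back to v if (22) allows it.

    M_v, M_v_next : total memory currently used at v and at v'
    m             : number of elements to move along e_j
    edge_min      : M_{e_j,head(e_j),min}, the most that can be moved
    Returns the updated pair (M_v, M_v').
    """
    if 0 < m <= edge_min and M_v + m < M_v_next:   # condition (22)
        return M_v + m, M_v_next - m
    return M_v, M_v_next

# Numbers of Example 8 (edge_min = 1 is assumed).
M_v1, M_v2 = try_distribute(M_v=0, M_v_next=3, m=1, edge_min=1)
```

With the numbers of Example 8 the move is accepted and the counts become 1 and 2. The total is unchanged, so the step only evens out where the memory sits.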



V. Single-generation network coding - Algorithm 

This section presents the main contribution of this paper. 

For an edge $e_i \in \mathcal{E}$, let $f_{e_i}(z) \in \mathbb{F}_q^n(z)$ represent the global
kernel of $e_i$. We say that a node $v \in \mathcal{V} \setminus \{s\}$ is a coding node
if the global kernel of at least one of its outgoing edges is an
$\mathbb{F}_q(z)$-linear combination of the global kernels of at least two
of its incoming edges. Otherwise, we call $v$ a forwarding node.

Let $\mathcal{V}_{cod}$ be the set of coding nodes, and $\mathcal{V}_{fwd}$ be the set
of forwarding nodes. Let $\mathcal{V}^0_{cod}$ be the set of all coding nodes
such that there exists no path in the network from any other
coding node to any node in $\mathcal{V}^0_{cod}$.

Towards proposing an algorithm to enable single-generation 
network coding, we make some observations and discuss the 
addition of memory elements at the coding nodes to achieve 
single-generation network coding. 
Fig. 8. The figure corresponding to Example 8 (Memory distribution).



Observation 1: For any $v \in \mathcal{V}^0_{cod}$, the global kernel of any
$e \in \Gamma_I(v)$ is of the form

$$f_e(z) = z^{l_e} f_e \tag{23}$$

for some positive integer $l_e$, with $f_e \in \mathbb{F}_q^n$. If the network is a
unit-delay network and the node $v$ uses no memory, the global
kernel of any $e_j \in \Gamma_O(v)$ is of the form

$$f_{e_j}(z) = \sum_{e_i \in \Gamma_I(v)} z K_{e_i,e_j} f_{e_i}(z) = \sum_{e_i \in \Gamma_I(v)} K_{e_i,e_j} z^{l_{e_i}+1} f_{e_i} \tag{24}$$

where $l_{e_i}$ is a positive integer signifying the accumulated delay
from the source to edge $e_i$, and $K_{e_i,e_j} \in \mathbb{F}_q$ signifies the local
kernel coefficient between $e_i$ and $e_j$. The additional $z$ is to
account for the delay in the unit-delay network.

A. Single-generation processing at the nodes

For every pair of edges $e_i, e_{i'} \in \Gamma_{I,e_j}(v)$ ($\Gamma_{I,e_j}(v)$ being
as in (8)) in (24) such that $l_{e_i} < l_{e_{i'}}$, we may add
$M_{e_i,e_j} = l_{e_{i'}} - l_{e_i}$ memory elements at node $v$ to delay the symbols
coming from $e_i$ such that the global kernel of the edge $e_j$
becomes

$$f_{e_j}(z) = z^{l_{e_j,\max}+1} \sum_{e_i \in \Gamma_I(v)} K_{e_i,e_j} f_{e_i} \tag{25}$$

where $l_{e_j,\max} = \max_{e_i \in \Gamma_{I,e_j}(v)} l_{e_i}$ and $K_{e_i,e_j} \in \mathbb{F}_q$. Once
this process of using memory at the node $v$ results in the global
kernel of every edge in $\Gamma_O(v)$ being a linear combination
of symbols from the same generation (generations between
different outgoing edges need not be the same), we say that
single-generation processing has been achieved at node $v$. For
a node $T \in \mathcal{T}$, we say that single-generation processing has
been achieved at sink $T$ if the condition (1) is satisfied along
with condition (25) for each $e_j \in \tilde{\Gamma}_O(T)$.
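
The alignment step described above amounts to delaying every contributing incoming edge up to the largest accumulated delay. A minimal sketch follows, assuming the accumulated delays $l_{e_i}$ of the incoming edges are already known and that each outgoing edge lists the incoming edges with nonzero local kernel coefficient; the function name, edge names, and delay values are hypothetical.

```python
def single_generation_delays(acc_delay, support):
    """Memory needed at a node so that each outgoing edge mixes one generation.

    acc_delay : dict e_in -> l_{e_in}, accumulated delay of each incoming edge
    support   : dict e_out -> list of incoming edges in Gamma_{I,e_out}(v)
    Returns (pair_memory, out_delay) where pair_memory[(e_in, e_out)] is
    M_{e_in,e_out} and out_delay[e_out] is the exponent of z in (25).
    """
    pair_memory, out_delay = {}, {}
    for e_out, ins in support.items():
        l_max = max(acc_delay[e] for e in ins)            # l_{e_out,max}
        for e in ins:
            pair_memory[(e, e_out)] = l_max - acc_delay[e]
        out_delay[e_out] = l_max + 1                      # unit delay of e_out
    return pair_memory, out_delay

# Hypothetical coding node: e1 arrives with delay 1, e2 with delay 3.
pair_memory, out_delay = single_generation_delays(
    {"e1": 1, "e2": 3}, {"e3": ["e1", "e2"]})
```

In this example two memory elements are placed on the path from `e1` to `e3` and none on the path from `e2`, so the global kernel of `e3` becomes $z^4$ times a constant vector, which is the form required by (25).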



Observation 2: We iteratively define the set $\mathcal{V}^i_{cod} \subset \mathcal{V}_{cod}$
as the set of coding nodes which have paths only from coding
nodes of the previously defined sets,
where $\mathcal{V}^0_{cod}$ is as defined before. Once memory has been used
to achieve single-generation processing at all nodes in $\mathcal{V}^{i-1}_{cod}$,
it can be observed that the global kernels of the incoming
and outgoing edges of any node $v \in \mathcal{V}^i_{cod}$ satisfy the same
condition as in (23) and (24).

Thus again memory elements can be used at the nodes
of $\mathcal{V}^i_{cod}$ to implement single-generation processing, ultimately
achieving single-generation processing at each coding node of
the network.

B. Algorithm for single-generation network coding 

Algorithm 2 is used to achieve
single-generation network coding using memory at the nodes
of the network, while trying to minimize the total number of
memory elements used in the network.

Remark 2: Algorithm 2 assumes that every node has unlim- 
ited memory to use and then tries to obtain a configuration that 
reduces the number of memory elements used in the network. 
However, if the maximum available memory in the nodes is 
limited, then the following techniques may be adopted after 
running Algorithm 2. 

• In line 27 of the algorithm, instead of checking condition
(22) at every pair of nodes connected by some edge, the
actual memory capability of the nodes must be taken into
account and then the distribution procedure of Subsection
IV-C can be run.
• Finally, at every node in which the algorithm demands
more memory elements than what is available, sufficient
memory elements should be removed so that the total
memory used at the node is at most what is available. As
the penalty of removing these memory elements will be
compensated by the sinks, the memory elements that will
be removed at the nodes should ideally be such that the
compensation occurs in the least number of sinks in the
least possible quantity.
Example 9: Fig. 10, Fig. 11, and Fig. 12 represent the
network at various stages of the algorithm applied on a
modified double-butterfly network as shown in Fig. 9. The
modified unit-delay double-butterfly network shown in Fig. 10
has the standard network code over $\mathbb{F}_2$. Here $s$ is the source node, and
$T_i$, $i = 1, 2, 3, 4$, are the sinks. The dotted lines represent the
virtual input edges at the source and virtual output edges at
the sinks.

Table I shows the network transfer matrices before and after
obtaining single-generation processing using Algorithm 2.
Table I also shows a comparison between the memory
requirements at the sinks (for decoding) between inter-generation
network coding (i.e., the memory-free case; the numbers shown
are the sum of the row degrees of realizable inverse matrices
in the third column) and single-generation network coding (as
shown in Fig. 12). In the memory-free case, assuming that
Input: A network $\mathcal{G}_m$ with delays and unused memory elements
Output: The network $\mathcal{G}_m$ with a single-generation network code using memory elements at nodes

1  foreach $v \in \mathcal{V}_{cod}$ in the ancestral order do
2      Introduce sufficient memory elements at node $v$ as in Subsection V-A in order to enable single-generation processing at node $v$.
3      foreach $e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$ do
4          Run the memory reduction procedure as in Subsection IV-A.
5      end
6  end
7  Now the global kernel of any edge $e_j \in \Gamma_I(T)$ of any sink $T$ is of the form $f_{e_j}(z) = z^{L_{e_j}} f_{e_j}$ for some positive integer $L_{e_j}$, with $f_{e_j} \in \mathbb{F}_q^n$.
8  foreach $T \in \mathcal{T}$ do
9      Add sufficient memory according to Subsection III-A and Subsection III-B such that single-generation processing is achieved at the sink $T$.
10 end
11 foreach $v \in \mathcal{V}$ in the reverse-ancestral order do
12     foreach pair of edge-sets $[\Gamma_{I_i}(v), \Gamma_{O_i}(v)] \in P_v$ do
13         if condition (19) is satisfied then
14             Run the memory reduction procedure as in Subsubsection IV-B2.
15         end
16     end
17 end
18 foreach $v \in \mathcal{V}$ in the reverse-ancestral order do
19     foreach subset $\Gamma'_O(v) \subset \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$ do
20         if condition (15) is satisfied then
21             Run the memory reduction procedure as in Subsubsection IV-B1.
22         end
23     end
24 end
25 foreach $v \in \mathcal{V}$ in the ancestral order do
26     foreach $e_j \in \Gamma_O(v)$ do
27         if condition (22) is satisfied then
28             Run the memory distribution procedure at $v$ as in Subsection IV-C.
29         end
30     end
31 end
32 foreach $v \in \mathcal{V}$ in the ancestral order do
33     foreach $e_j \in \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$ do
34         foreach $e_i \in \Gamma_{I,e_j}(v)$ do
35             Update the corresponding elements in the $A$, $K$, and $B^v$ matrices according to (2), (3), and (4) of Subsection III-A upon calculating $M_{e_i,e_j}$.
36         end
37         Update the corresponding elements in the $A$, $K$, and $B^v$ matrices according to (5), (6), and (7) of Subsection III-B upon calculating $M_{e_j,tail(e_j)}$.
38     end
39 end

Algorithm 2. Algorithm for using memory at nodes to obtain a single-generation network code
Fig. 9. Figure corresponding to Example 9. A modified double-butterfly network. The mapping between the incoming and outgoing symbols
($a_1, a_2, b_1, b_2, c_1, c_2 \in \mathbb{F}_2$) at the nodes $v_4$, $T_1$, and $v_9$ is shown.




Fig. 10. Figure corresponding to Example 9. After line 10 of Algorithm 2, single-generation network coding has been implemented in the network and all the sinks see a network transfer matrix as in (1). Each box indicates the presence of memory elements at the associated node. The way sink $T_1$ uses memory is expanded below. Total memory used at this stage is 20.
Fig. 11. Figure corresponding to Example 9. The network after line 24 of the algorithm. Comparing this figure with Fig. 10, memory reduction according to Subsubsection IV-B1 has resulted in the 'absorption' of memory elements from the nodes $v_4$, $T_1$, $v_7$, $v_9$, and $T_4$. Total memory used in the network now is 12.




Fig. 12. Figure corresponding to Example 9. The network at the end of Algorithm 2. The 12 memory elements used in Fig. 11 are further distributed amongst the nodes of the network.



sinks use memory individually to decode, the total number of 
memory elements used in the network is 19, and all of them 
are used at the sinks. In the single-generation network coded 
network as shown in Fig. 12, it can be seen that the total 
number of memory elements used in the network is 12, out of 
which only 7 are used at the sinks, thereby showing a marked 
reduction from the memory-free case. The rest of the memory 
elements (numbering 5) are distributed across the nodes of the 
network. 

C. Comparison with the approach of [5] 

We can compare the straightforward approach of [5] with our approach to obtaining a single-generation network-coded network for the modified unit-delay double-butterfly network of Fig. 9. The technique in [5] would yield the network of Fig. 10, which uses 20 memory elements to obtain single-generation network coding. Our algorithm, in contrast, applies the memory reduction and distribution techniques of Section IV and yields the network of Fig. 12, which uses 12 memory elements and distributes them more uniformly across the network than in Fig. 10. Although the overall memory usage is reduced, it remains to be shown whether Algorithm 2 actually obtains a configuration of the network that uses the minimal number of memory elements needed for single-generation network coding.
TABLE I

Comparing inter-generation (memory-free) and single-generation (using memory) network coding for the network in Fig. 9

Sink | Network transfer matrix before Algorithm 2 | Realizable decoding matrix obtained from $M_T^{-1}(z)$ | Network transfer matrix after Algorithm 2 | No. of memory elements used before Algorithm 2 | No. of memory elements used after Algorithm 2
$T_1$ | $M_{T_1}(z) = \dots$ | $P_{T_1}(z) = \dots$ | $M_{T_1}(z) = \dots$ | … | …
$T_2$ | $M_{T_2}(z) = \dots$ | $P_{T_2}(z) = \dots$ | $M_{T_2}(z) = \dots$ | … | …
$T_3$ | $M_{T_3}(z) = \dots$ | $P_{T_3}(z) = \dots$ | $M_{T_3}(z) = z^{9}\,[\dots]$ | … | …
$T_4$ | $M_{T_4}(z) = \dots$ | $P_{T_4}(z) = \dots$ | $M_{T_4}(z) = \dots$ | … | …


VI. Impact of single-generation network coding on network-error correction

A. Impact on encoding 

Construction of a CNECC: For details on the basics of convolutional codes, we refer the reader to [6]. The construction of a CNECC [4] for a given acyclic, unit-delay, memory-free network which corrects error vectors corresponding to a given set $\Phi$ of error patterns (an error pattern is a subset of $\mathcal{E}$ indicating the edges in error) can be summarized as follows.

• Compute the set $W_s$ of error vector reflections given by

$$W_s = \bigcup_{T \in \mathcal{T},\, \rho \in \Phi} \{ w F_T(z) p_T(z) M_T^{-1}(z) \mid w \in \rho \}$$

where $w \in \mathbb{F}_q^{|\mathcal{E}|}$ is an error vector, and $w \in \rho$ means that $w$ matches an error pattern $\rho$. Here $p_T(z) \in \mathbb{F}_q[z]$ (the ring of polynomials) is some processing function chosen such that the processing matrix $p_T(z) M_T^{-1}(z) = P_T(z)$ is a polynomial matrix.

• Let $t_s = \max_{w_s(z) \in W_s} w_H(w_s(z))$. Choose an input convolutional code $\mathcal{C}_s$ with free distance at least $2t_s + 1$ as the CNECC for the given network.

The following lemma gives a bound on $t_s$ and therefore on the free distance demanded of the CNECC.

Lemma 2 ([4]): Given an acyclic, unit-delay, memory-free network $\mathcal{G}(\mathcal{V}, \mathcal{E})$ with a given error pattern set $\Phi$, let $T_{delay} - 1$ be the maximum degree of any polynomial in the $F(z)$ matrix. Let $w_H$ indicate the Hamming weight over $\mathbb{F}_q$. If $r$ is the maximum number of non-zero coefficients of the polynomials $p_T(z)$ corresponding to all sinks in $\mathcal{T}$, i.e., $r = \max_{T \in \mathcal{T}} w_H(p_T(z))$, then we have

$$t_s \leq r n \left[ (n + 1)(T_{delay} - 1) + 1 \right].$$

Algorithm 2 does not increase the value of $T_{delay}$ in the matrix $F(z)$, because no additional delay is introduced on any path between nodes which are at a distance of $T_{delay}$ edges (the maximum number of edges on any path between any two nodes) from each other. Also, with memory being introduced in the nodes according to Algorithm 2, the network transfer matrices at all the sinks are of the form given in (1). Therefore the processing function at any sink $T$ is of the form $p_T(z) = z^{L_T}$, i.e., $r = 1$.

Therefore, for the network with delay and memory (used to achieve single-generation network coding), we have

$$t_s \leq n \left[ (n + 1)(T_{delay} - 1) + 1 \right].$$



Thus, the bound on $t_s$, and therefore on the free distance demanded of the CNECC, may be lower for the unit-delay, single-generation network coded network than for its unit-delay, memory-free counterpart (namely, whenever $r > 1$ in the memory-free network). However, a decrease in the actual value of $t_s$ cannot be guaranteed, and $t_s$ has to be computed for every network individually in order to decide whether the CNECC designed for the unit-delay, memory-free network will continue to work for the single-generation network coded unit-delay counterpart.
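
As a rough illustration of how the bound of Lemma 2 changes when $r$ drops to 1, the following sketch evaluates $r n [(n+1)(T_{delay}-1)+1]$ for both cases. The function names and the numerical values of $n$, $T_{delay}$, and $r$ are assumptions chosen for illustration and are not taken from the example network; the sketch also reads the bound as non-strict.

```python
def ts_bound(n, t_delay, r=1):
    """Upper bound of Lemma 2: r * n * [(n + 1) * (t_delay - 1) + 1]."""
    if min(n, t_delay, r) < 1:
        raise ValueError("n, t_delay and r must be positive integers")
    return r * n * ((n + 1) * (t_delay - 1) + 1)


def sufficient_free_distance(t_s):
    """Free distance 2 * t_s + 1 that suffices to correct t_s errors."""
    return 2 * t_s + 1


# Illustrative values only (assumed, not from the paper's example network).
n, t_delay, r = 2, 4, 3
memory_free_bound = ts_bound(n, t_delay, r)   # 3 * 2 * (3 * 3 + 1) = 60
single_gen_bound = ts_bound(n, t_delay)       # 2 * (3 * 3 + 1) = 20
```

Because these are only upper bounds, a smaller bound does not by itself imply a smaller actual $t_s$, in line with the caveat above.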

B. Impact on decoding 

Decoding of a CNECC: Let $G_I(z)$ be the generator matrix of the code $\mathcal{C}_s$ thus designed. We refer to the code $\mathcal{C}_s$ as the input convolutional code [3]. The effective code seen by a sink $T$ is generated by the matrix $G_{O,T}(z) = G_I(z) M_T(z)$ and is known as the output convolutional code [3], $\mathcal{C}_{O,T}$, at sink $T$. The decoding of the CNECC at any sink $T$ can be performed either on the trellis of the code $\mathcal{C}_s$ or on that of the code $\mathcal{C}_{O,T}$ at that particular sink, according to the free distance of $\mathcal{C}_{O,T}$ ($d_{free}(\mathcal{C}_{O,T})$), the catastrophic/non-catastrophic nature of $G_{O,T}(z)$, and a parameter called $T_{d_{free}}(\mathcal{C}_{O,T})$, whose definition for a rate $b/c$ code $\mathcal{C}$ over $\mathbb{F}_q$ is given in [3] as follows.

$$T_{d_{free}}(\mathcal{C}) := \max_{v_{[0,j)} \in S_{d_{free}}} j + 1 \qquad (26)$$

where $S_{d_{free}}$ [3] is defined as follows.

$$S_{d_{free}} := \{ v_{[0,j)} \mid w_H(v_{[0,j)}) < d_{free}(\mathcal{C}),\ \sigma_0 = 0,\ \forall j > 0 \}$$

where

$$v_{[0,j)} := [v_0, v_1, \dots, v_{j-1}]$$

is a truncated codeword sequence with $v_i \in \mathbb{F}_q^c$, $\sigma_t$ indicates the content of the delay elements in the encoder at time $t$, and $w_H$ indicates the Hamming weight over $\mathbb{F}_q$. The set $S_{d_{free}}$ consists of all possible truncated codeword sequences $v_{[0,j)}$ of weight less than $d_{free}(\mathcal{C})$ that start in the zero state. Then, we have the following proposition.

Proposition 1 ([3]): The minimum Hamming weight trellis decoding algorithm can correct all error sequences which have the property that the Hamming weight of the error sequence in any consecutive $T_{d_{free}}(\mathcal{C})$ segments (a segment being a collection of $c$ output symbols corresponding to every $b$ input symbols) is at most $\lfloor (d_{free}(\mathcal{C}) - 1)/2 \rfloor$.
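
The condition of Proposition 1 can be checked for a given error sequence with a sliding window over the segments. The sketch below is a minimal illustration, assuming that the weight limit is $\lfloor (d_{free}(\mathcal{C}) - 1)/2 \rfloor$ and that the per-segment error weights are already available; the function name and the input representation are hypothetical.

```python
def correctable_by_proposition_1(segment_error_weights, d_free, t_dfree):
    """Check the sufficient condition of Proposition 1.

    segment_error_weights[k] is the Hamming weight of the error in the k-th
    segment. Returns True if every run of t_dfree consecutive segments has
    total error weight at most floor((d_free - 1) / 2).
    """
    limit = (d_free - 1) // 2
    n = len(segment_error_weights)
    if n <= t_dfree:
        return sum(segment_error_weights) <= limit
    window = sum(segment_error_weights[:t_dfree])
    if window > limit:
        return False
    for k in range(t_dfree, n):
        # Slide the window by one segment: add the new one, drop the oldest.
        window += segment_error_weights[k] - segment_error_weights[k - t_dfree]
        if window > limit:
            return False
    return True
```

For instance, with $d_{free} = 5$ and $T_{d_{free}} = 6$, two weight-2 error bursts that are six segments apart satisfy the condition, whereas three single errors in adjacent segments do not. The condition is sufficient, not necessary, so a `False` result does not mean that decoding must fail.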

With the CNECC in place in a unit-delay, memory-free network, under certain conditions (see Subsection IV-D of [4]), a sink has to decode on the trellis of the input convolutional code, in which case the sink has to multiply the incoming $n$ output streams with the processing matrix $P_T(z)$, which may require additional memory elements to implement. However, with a single-generation network code implemented using memory elements, part of this processing is done in a distributed manner in the other nodes of the network, thereby decreasing the memory requirement at the sinks.

In the forthcoming section, we further observe the advan- 
tages that the use of memory in the intermediate nodes offers 
in the performance of CNECCs under a probabilistic error 
setting. 

VII. Simulation results 

A. A probabilistic error model 

Probabilistic error models have been considered in the context of random network coding in [7]. We define a probabilistic error model for a unit-delay network $\mathcal{G}(\mathcal{V}, \mathcal{E})$ by defining the probabilities of any set of $i$ ($i \leq |\mathcal{E}|$) edges of the network being in error at any given time instant. Across time instants, we assume that the network errors are i.i.d. according to this distribution.

$$\text{Prob.}(i \text{ network edges being in error}) = p^i \qquad (27)$$
$$\text{Prob.}(\text{no edges are in error}) = q \qquad (28)$$

where $1 \leq i \leq |\mathcal{E}|$, and $p, q \leq 1$ are real numbers indicating the probability of any single edge error in the network and the probability of no edges in error respectively, such that $q + \sum_{i=1}^{|\mathcal{E}|} p^i = 1$.
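
A minimal sketch of how error patterns could be drawn from the model of (27) and (28) is given below. It assumes that the probabilities sum to one, so that $q = 1 - \sum_{i=1}^{|\mathcal{E}|} p^i$, and it additionally assumes that, once the number $i$ of erroneous edges is drawn, the $i$ edges are chosen uniformly at random; the latter choice and all function names are assumptions made here for illustration.

```python
import random


def no_error_probability(p, num_edges):
    """q = 1 - sum_{i=1}^{|E|} p^i, assuming the probabilities sum to one."""
    q = 1.0 - sum(p ** i for i in range(1, num_edges + 1))
    if q < 0:
        raise ValueError("p is too large for a valid distribution")
    return q


def sample_error_pattern(p, num_edges, rng):
    """Draw the set of edges in error for one time instant."""
    u = rng.random()
    cumulative = no_error_probability(p, num_edges)
    if u < cumulative:
        return frozenset()
    for i in range(1, num_edges + 1):
        cumulative += p ** i
        if u < cumulative:
            # Assumption: the i erroneous edges are chosen uniformly at random.
            return frozenset(rng.sample(range(num_edges), i))
    return frozenset(range(num_edges))  # guard against rounding when u is near 1


rng = random.Random(0)
# Independent draws across time instants, matching the i.i.d. assumption.
patterns = [sample_error_pattern(0.1, 10, rng) for _ in range(1000)]
```

With $p = 0.1$ and $|\mathcal{E}| = 10$, most time instants are error-free, and multi-edge errors become rapidly less likely as $i$ grows.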

B. Simulations on the modified butterfly network 

Fig. 13 shows a modified butterfly network before and after running Algorithm 2. This network is clearly a part of the modified double-butterfly network of Fig. 9, and the associated matrices at the sinks $T_1$ and $T_2$ are given in Table I. With the probability model as in (27) and (28) with $|\mathcal{E}| = 10$ for this network, we simulate the performance of 3 input convolutional codes implemented on this network for both the with-memory and memory-free cases as in Fig. 13, with the sinks performing hard-decision decoding on the trellis of the input convolutional code.

In the following discussion we refer to sinks $T_1$ and $T_2$ of Fig. 13 as Sink 1 and Sink 2. The 3 input convolutional codes and the rationale for choosing them are given as follows.

• Code $\mathcal{C}_1$ is generated by the generator matrix

$$G_{I_1}(z) = [\,1 + z \quad 1\,],$$

with $d_{free}(\mathcal{C}_1) = 3$ and $T_{d_{free}}(\mathcal{C}_1) = 2$. This code is chosen only to illustrate the error correcting capability of codes with low values of $d_{free}(\mathcal{C})$ and $T_{d_{free}}(\mathcal{C})$.

• Code $\mathcal{C}_2$ is generated by the generator matrix

$$G_{I_2}(z) = [\,1 + z^2 \quad 1 + z + z^2\,],$$

with $d_{free}(\mathcal{C}_2) = 5$ and $T_{d_{free}}(\mathcal{C}_2) = 6$. This code corrects all double edge errors in the instantaneous version (with all edge delays and memories being zero) of Fig. 13 as long as they are separated by 6 network uses.

• Code $\mathcal{C}_3$ is generated by the generator matrix

$$G_{I_3}(z) = [\,1 + z + z^4 \quad 1 + z^2 + z^3 + z^4\,],$$

with $d_{free}(\mathcal{C}_3) = 7$ and $T_{d_{free}}(\mathcal{C}_3) = 12$. This code corrects all double edge errors in the unit-delay network given in Fig. 13 as long as they are separated by 12 network uses.

We note here that the values of $T_{d_{free}}(\mathcal{C})$ of the 3 codes increase with their free distances, i.e., the code with the greater free distance has the higher $T_{d_{free}}(\mathcal{C})$.
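
The free distances quoted for the three codes can be cross-checked with a shortest-path search on the encoder state diagram. The sketch below is one standard way of doing this for rate-$1/c$ binary feedforward encoders, not a procedure described in the paper; the function name and the coefficient-list representation of the generators are assumptions.

```python
import heapq


def free_distance(generators):
    """Free distance of a rate-1/c binary feedforward convolutional code.

    generators[k][d] is the coefficient of z^d in the k-th generator
    polynomial. Runs Dijkstra's algorithm from the first departure from the
    zero state until the first return to it.
    """
    memory = max(len(g) for g in generators) - 1
    gens = [list(g) + [0] * (memory + 1 - len(g)) for g in generators]

    def step(state, bit):
        # state holds the previous `memory` input bits, most recent first.
        window = (bit,) + state
        weight = sum(sum(c & x for c, x in zip(g, window)) % 2 for g in gens)
        return window[:memory], weight

    zero = (0,) * memory
    start, w0 = step(zero, 1)  # leave the zero state with input bit 1
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, state = heapq.heappop(heap)
        if state == zero:
            return w  # first return to the zero state has minimum weight
        if w > best.get(state, float("inf")):
            continue
        for bit in (0, 1):
            nxt, dw = step(state, bit)
            if w + dw < best.get(nxt, float("inf")):
                best[nxt] = w + dw
                heapq.heappush(heap, (w + dw, nxt))
    return None


d1 = free_distance([[1, 1], [1]])                      # [1 + z, 1]
d2 = free_distance([[1, 0, 1], [1, 1, 1]])             # [1 + z^2, 1 + z + z^2]
d3 = free_distance([[1, 1, 0, 0, 1], [1, 0, 1, 1, 1]])  # [1 + z + z^4, 1 + z^2 + z^3 + z^4]
```

The search confirms only the free distances; computing $T_{d_{free}}(\mathcal{C})$ would require enumerating the truncated codeword sequences of the set $S_{d_{free}}$, which this sketch does not attempt.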

Fig. 14 and Fig. 15 illustrate the BERs for these 3 codes 
for both the with-memory and memory-free case for different 
values of the parameter p (the probability of a single edge 
error) of (27). Clearly the BER values fall with decreasing p. 

The description and explanation of the regions marked '$d_{free}$ dominated region' and '$T_{d_{free}}$ dominated region' (named so according to the dominant parameter in those regions) are given in [3]. In the following discussion, we concentrate on the comparison between the performance of every code in the memory-free and the with-memory case. Towards that end, we recall from Proposition 1 that both the Hamming weight of error events and the separation between any two consecutive error events are important to correct them.

Performance improvement of CNECCs with memory at the 
intermediate nodes: 

1) With respect to codes $\mathcal{C}_2$ and $\mathcal{C}_3$, we see that there is an improvement in performance when memory is used at the intermediate nodes. This is because the presence of memory elements in the network results in a clumping-together of error bits at the sinks. For example, assume that in the network of Fig. 13, an error occurs in edge $s \to v_1$ at time instant $t_1$. We consider the situation at Sink 2. In the memory-free case, the effect of this error is felt at different time instants at the two incoming edges of Sink 2, at $t_1 + 1$ and at $t_1 + 4$. However, with memory elements at the intermediate nodes, the effects of the edge error now occur at the same time instant ($t_1 + 4$) in both the incoming edges of Sink 2. Cumulatively, such errors result in more error events (each of lower Hamming weight) in the memory-free case (because of the distribution of errors) and fewer error events (each of comparatively higher Hamming weight) in the with-memory case (as a result of clumped errors). Because codes $\mathcal{C}_2$ and $\mathcal{C}_3$ have enough free distance, the number of such error events is what dominates the performance. Therefore codes $\mathcal{C}_2$ and $\mathcal{C}_3$ correct more errors in the with-memory case. The same effect may be observed at Sink 1 also.

2) With respect to the code $\mathcal{C}_1$, there is no observable change in performance between the memory-free and with-memory cases. We note that the same effect is observed with the errors as in the previous case. But because $T_{d_{free}}(\mathcal{C}_1)$ is small (only 2), the clumping together of error bits does not benefit much. Therefore there is no significant improvement in performance.



Fig. 13. A modified butterfly network

Fig. 14. BER (with and without memory) at Sink 1, plotted against the probability of a single edge error ($p$)

Fig. 15. BER (with and without memory) at Sink 2, plotted against the probability of a single edge error ($p$)



3) There is no significant difference in the performance of any code between the memory-free and the with-memory cases in the '$d_{free}$ dominated region.' This is because the errors that occur in the network are already sparse.
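
The clumping effect described in item 1 can be illustrated with a few lines of arithmetic on path delays. The sketch below uses the delays 1 and 4 quoted for Sink 2; the function names, the error time, and the rule of padding every path up to the longest delay are assumptions made for illustration and are not a description of how Algorithm 2 places memory.

```python
def arrival_times(t_error, path_delays):
    """Time instants at which an edge error reaches each incoming edge of a sink."""
    return [t_error + d for d in path_delays]


def equalise_with_memory(path_delays):
    """Pad every path with memory elements up to the longest delay (assumed rule)."""
    longest = max(path_delays)
    return [longest] * len(path_delays)


def affected_instants(t_error, path_delays):
    """Distinct time instants at which the sink sees corrupted symbols."""
    return sorted(set(arrival_times(t_error, path_delays)))


t1 = 10                # arbitrary error time, for illustration only
memory_free = [1, 4]   # delays quoted in the text for Sink 2
with_memory = equalise_with_memory(memory_free)

spread = affected_instants(t1, memory_free)    # two separate instants
clumped = affected_instants(t1, with_memory)   # a single instant
```

In the memory-free case the single edge error touches two distinct time instants at the sink, whereas with equalised delays it touches only one, which is the clumping referred to above.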